Abstract

We study a system in which N agents have to decide between two strategies $\theta_i$ ($i \in
1, \ldots, N$), for defection or cooperation, when interacting with n other agents (either spatial
neighbors or randomly chosen ones). After each round, they update their strategy responding
nonlinearly to two different information sources: (i) the payoff $a_i(\theta_i, f_i)$ received from the
strategic interaction with their n counterparts, (ii) the fraction $f_i$ of cooperators in this
interaction. For the latter response, we assume social herding, i.e. agents adopt their strategy
based on the frequencies of the different strategies in their neighborhood, without taking into
account the consequences of this decision. We note that $f_i$ already determines the payoff,
so there is no additional information assumed. A parameter $\zeta$ defines to what level agents
take the two different information sources into account. For the strategic interaction, we
assume a Prisoner's Dilemma game, i.e. one in which defection is the evolutionarily stable
strategy. However, if the additional dimension of social herding is taken into account, we
find instead a stable outcome where cooperators are the majority. By means of agent-based 
computer simulations and analytical investigations, we evaluate the critical conditions for 
this transition towards cooperation. We find that, in addition to a high degree of social 
herding, there has to be a nonlinear response to the fraction of cooperators. We argue that 
the transition to cooperation in our model is based on less information, i.e. on agents which 
are not informed about the payoff matrix, and therefore rely on just observing the strategy 
of others, to adopt it. By designing the right mechanisms to respond to this information, the 
transition to cooperation can be remarkably enhanced. 

Keywords: Prisoner's dilemma, social influence, mechanism design, nonlinear voter 
model 



1 Introduction 

Cooperation is an abundant phenomenon in biological and social systems, but in most
game-theoretical approaches defection should be the rational strategy to choose. In order to resolve
this paradox, a vast body of literature has proposed modifications to the classical approach. They
can be categorized along different directions:

• changes of the payoff structure: lowering the costs of cooperation to make it more attractive
in the first place is another form of "buying cooperation",

• extension of the time horizon: considering repeated interaction, a memory for the
strategy of the counterparts, calculating payoffs over a longer time interval, or anticipating the
future response to one's own action,

• considering spatial interaction: the threshold for the outbreak of cooperation is lowered if 
agents' interaction is constrained to their nearest or second-nearest neighbors (as opposed 
to randomly chosen agents), or if agents can migrate between different spatial domains 

We note that, particularly for biological systems, other additional mechanisms have been
considered [15], such as altruism, the role of kinship relations, selection mechanisms on the group
level, etc.

In this paper, we add a new element to the discussion: social herding, i.e. a mechanism that 
does not take strategic considerations into account. Agents can observe the actions of others 
without knowing their consequence. In a game-theoretical setting this means they cannot adopt 
a certain strategy based on payoff considerations because the payoff structure is not known to 
them. Thus, agents are just left with knowing the frequency of strategies either globally or in 
their neighborhood, and they choose their own strategy only based on the information about 
the frequency of these strategies. In our model, we assume that any agent can consider both the 
payoff-related and the frequency-related information and weight their influence by a parameter $\zeta$,
which is assumed to be the level of social herding. Precisely, $\zeta \to 0$ results in purely payoff-driven
decisions, $\zeta \to 1$ in pure social herding.

The case where social herding is dominant has been widely studied in binary opinion dynamics
models [5], [21], [23], [29], where opinions are not necessarily related to payoff but rather to social
norms. Thus, agents may adopt the opinion of a majority in order to minimize social conflicts,
but they may not have a utility-based preference for either of these opinions. Instead, their
opinion results from a frequency-dependent decision. The so-called linear voter model, where
the probability to choose a particular opinion is directly proportional to its frequency, is a very
common example of this. It is known to result asymptotically in consensus, i.e. the existence of
only one opinion, but which opinion will dominate is not determined. In the mean-field
limit, this model always results in consensus on either of the two opinions. Starting e.g. with
the frequency $f_1$ of opinions $\theta = 1$ and $f_0 = 1 - f_1$ with $\theta = 0$, the probability that the final
consensus state is $\theta_i = 1$ for all i is $f_1$ [10]. Hence, a simple majority rule of social herding, as
expressed in the linear voter model, may not improve the situation for cooperation. Therefore,
we turn to the class of nonlinear voter models [18] in Sect. 2. As we will also show analytically
in Sect. 3, nonlinear social herding by itself will not lead to a transition towards cooperation.
Instead, the right level of social herding is needed in combination with the right nonlinearity
to enhance cooperation.

What do we gain from such insights? First of all, a better understanding of the fact that more
information does not necessarily lead to a better outcome (in this case, to cooperation). Common
wisdom would suggest that it is always better to have more information, e.g. to choose among
more alternatives, to determine their consequences in advance, and thus to reduce the risk
associated with making the wrong decision. What seems to be an optimal strategy on the individual
level turns out to lead to a lock-in into unfavorable situations on the global level. For
example, in experiments on the wisdom-of-crowds effect, it was shown that more information about
the guesses of other agents, combined with social influence, leads to a failure in the predictions
[13]. Also, in a network formation model of agents sharing knowledge, it was shown that best
response, i.e. the choice of partners based on knowing all alternatives, resulted in a worse global
performance as compared to a situation where just the next best partner was accepted [12]. As
we point out with this work, leaving the trap of defection also crucially depends on using less
of the available information, or on having a considerable fraction of less informed agents.

Second, from our insights we can derive mechanisms to improve the outcome in systems of strate- 
gically interacting agents. Mechanism design can be seen as the engineering part of economics. 
It allows one to propose rules, or algorithms, for interactions that avoid the system getting trapped
in suboptimal states. Some of these algorithms, such as the now famous "Gale-Shapley"
algorithm [7], are basically related to combinatorial optimization problems, i.e. they propose a
solution for the agents without involving the agents themselves in finding it. Systems design,
the way we see it, aims instead at proposing new ways of interaction at the agent level, in order 
to arrive at more favorable solutions at the system's level. Our paper gives a lucid example of 
this kind of systems design, by proposing a different way of combining information an individual 
agent already has. This still leaves room for the forces of self-organization to act, but restricts 
the possible negative outcome. 

2 Basic Model 

2.1 Combining social herding and strategic interaction 

We consider a system with N agents. Each agent $i \in 1, \ldots, N$ is characterized by two individual
variables which may change over time: $\theta_i(t)$ shall describe the agent's strategic behavior when
interacting with other agents, whereas $\zeta_i(t)$ shall describe how much the agent is prone to social
influence. We adopt the definition of social influence as the psychological tendency of individuals
to adhere to and behave according to the expectations of their local neighborhood. In this
sense, our approach belongs to a wider class of models which do not restrict herding behavior to
perfectly rational agents [16].

In an economic context, $\theta_i$ refers to the strategy of a utility-maximizing agent, chosen from a
(discrete) set $\sigma$ of possible strategies. We use the standard game-theoretical setting of a Prisoner's
Dilemma (PD) game, i.e. $\sigma \in \{0, 1\}$, where the strategic behavior $\sigma = 0$ refers to defection (D)
and $\sigma = 1$ to cooperation (C).

We assume that each agent plays a 2-person (non-iterated) game with n other agents which are 
located in its neighborhood. The completion of these n games is called a round. From each of 
these interactions the agent receives a payoff which depends both on the strategic behavior of the 
agent itself and on the opponents'. The game structure describing a single interaction between 
two agents can be summarized by the standard payoff matrix of a 2-person game: 





$$
\begin{array}{c|cc}
 & \theta_j = 1 & \theta_j = 0 \\ \hline
\theta_i = 1 & R/R & S/T \\
\theta_i = 0 & T/S & P/P
\end{array}
$$



Suppose agent i has chosen to cooperate. Then its payoff is R if the other agent j has also chosen
to cooperate (without knowing about the decision of agent i), but S if agent j defects. On the
other hand, if agent i has chosen to defect, then it will receive the payoff T if agent j cooperates,
while it will receive P if agent j defects.

In this paper, we will restrict the discussion to the PD game, but we note that our investigations
can be extended to other games that result from different values of R, S, T and P [20]. For the
particular case of the PD game, the payoffs have to fulfill the following two inequalities:

$$T > R > P > S; \qquad 2R > S + T \tag{1}$$

The known standard values are T = 5, R = 3, P = 1, S = 0. This implies that, in a so-called
one-shot game (no repeated interaction), defection, $\sigma = 0$, is the rational strategy because it
yields the higher payoff for an agent i no matter whether the opponent chooses C or D. As
this argument applies to both agents, one can expect that on the system level a global defective
behavior emerges. Because of this, the PD game has become a paradigmatic model to study
different mechanisms of transition towards a global cooperative behavior [1], [26], a question that
has puzzled the scientific community for decades.
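The two conditions of Eq. (1) and the dominance of defection in the one-shot game can be checked numerically. The following is a minimal Python sketch under the standard payoff values quoted above; the function names `is_prisoners_dilemma` and `best_reply` are hypothetical and only serve as illustration.

```python
# Standard PD payoff values quoted in the text (T = 5, R = 3, P = 1, S = 0).
T, R, P, S = 5, 3, 1, 0


def is_prisoners_dilemma(T, R, P, S):
    """Check the two inequalities of Eq. (1)."""
    return (T > R > P > S) and (2 * R > S + T)


def best_reply(opponent_cooperates, T, R, P, S):
    """Return 1 (cooperate) or 0 (defect), whichever pays more in a one-shot game."""
    payoff_if_cooperating = R if opponent_cooperates else S
    payoff_if_defecting = T if opponent_cooperates else P
    return 1 if payoff_if_cooperating > payoff_if_defecting else 0


print("PD conditions hold:", is_prisoners_dilemma(T, R, P, S))
print("best reply to C:", best_reply(True, T, R, P, S))
print("best reply to D:", best_reply(False, T, R, P, S))
```

For any payoff set that fulfills Eq. (1), `best_reply` returns 0 for both opponent choices, which is the dominance argument made in the text.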

Let us define the degree of cooperation on the system's level by the total number of cooperating
agents, $N_1(t)$, relative to the total population N. Since the number of agents is constant, the
global frequencies $f_\sigma$ of cooperating and defecting agents are given by

$$N = \sum_{\sigma} N_\sigma = N_0 + N_1 = \mathrm{const.}; \quad \sigma \in \{0, 1\}, \qquad f_\sigma = \frac{N_\sigma}{N}; \quad f \equiv f_1 = 1 - f_0. \tag{2}$$

In the following, the variable f shall refer to the global frequency of cooperators.

The interaction of each agent with n other agents in a 2-person game results in $\binom{N}{n}$ different
possibilities to choose a partner. As the result of these interactions that may occur independently,
but simultaneously [8], [20], agent i receives a total payoff $A_i(\theta_i)$ which depends both on its own
strategy $\theta_i$ and the strategies of the n different partners. Let us assume that $n_0$ of these partners
have chosen to defect, whereas $n_1 = n - n_0$ partners have chosen to cooperate. Then the total
payoff from these n interactions reads:



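$$A_i(\theta_i) = \delta_{1,\theta_i} \left[ n_1 R + n_0 S \right] + \delta_{0,\theta_i} \left[ n_1 T + n_0 P \right] \tag{3}$$

where $\delta_{x,y}$ means the Kronecker delta, which is 1 only for x = y and zero otherwise. Dividing by
n gives the scaled total payoff:

$$a_i(\theta_i, f_i) = \frac{A_i}{n} = \delta_{1,\theta_i} \left[ f_i R + (1 - f_i) S \right] + \delta_{0,\theta_i} \left[ f_i T + (1 - f_i) P \right] \tag{4}$$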



where $f_i = n_1/n = 1 - n_0/n$ gives the fraction of cooperating agents that agent i interacts with.
Assuming e.g. that agent i interacts with its neighbors, $f_i$ gives the local frequency of cooperators.
If, on the other hand, agent i interacts with n randomly chosen agents, the probability to choose a
cooperator is directly proportional to the global fraction f. I.e. in the so-called mean-field approach
we set $f_i = f$.
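Eqs. (3) and (4) translate directly into code. The sketch below is only an illustration with the standard payoff values as defaults; the function names `total_payoff` and `scaled_payoff` are hypothetical. It tabulates the scaled payoff of a cooperator and of a defector for the five possible local frequencies when n = 4.

```python
def total_payoff(theta, n1, n0, T=5, R=3, P=1, S=0):
    """Eq. (3): total payoff of an agent with strategy theta (1 = C, 0 = D)
    facing n1 cooperating and n0 defecting partners."""
    if theta == 1:
        return n1 * R + n0 * S
    return n1 * T + n0 * P


def scaled_payoff(theta, f, T=5, R=3, P=1, S=0):
    """Eq. (4): payoff per interaction, given the fraction f of cooperating partners."""
    if theta == 1:
        return f * R + (1 - f) * S
    return f * T + (1 - f) * P


n = 4
for k in range(n + 1):
    f = k / n
    print(f"f = {f:.2f}  a(1, f) = {scaled_payoff(1, f):.2f}  a(0, f) = {scaled_payoff(0, f):.2f}")
```

In every row the defector's scaled payoff exceeds the cooperator's, which is the property used later for the strategic part of the transition rates.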

Strategic considerations imply that agent i pays attention to the scaled payoff $a_i(\theta_i, f_i)$ expected
from the interaction with a fraction $f_i$ of cooperators, which of course also depends on its own
strategy $\theta_i$. A nonlinear function $\mathcal{G}(a_i)$ shall consider the way agent i combines the information
about the different payoffs $a_i(\theta_i, f_i)$ and $a_i(1 - \theta_i, f_i)$ resulting from its possible strategic choice.
This shall be used below to define the transition rate for an agent to change between strategies,
therefore we conveniently normalize $\mathcal{G}(a_i)$ to one. In a very general way, we assume:

$$\mathcal{G}(a_i) = \frac{\exp\left[\beta_i\, a_i(\theta_i, f_i)\right]}{\exp\left[\beta_i\, a_i(\theta_i, f_i)\right] + \exp\left[\beta_i\, a_i(1 - \theta_i, f_i)\right]} \tag{5}$$

Eq. (5) has the form of a logit function well established in decision theory (U [HJ |5U] . The
parameter $\beta_i$ allows agents to individually weight differences between the payoffs: $\beta_i \to 0$ represents
the limit of random choice between strategies, $\mathcal{G}(a_i) \to 1/2$, whereas $\beta_i \to \infty$ means that even
small differences in payoff lead to an immediate switch between $\mathcal{G}(a_i) = 0$ and $\mathcal{G}(a_i) = 1$. For
small values of $\beta_i$, $\mathcal{G}(a_i)$ tends to one if the expected payoff $a_i(\theta_i, f_i)$ from strategy
$\theta_i$ is much larger than the expected payoff $a_i(1 - \theta_i, f_i)$ from the opposite strategy $1 - \theta_i$, and
it tends to zero in the opposite case. If both payoffs become comparable, $\mathcal{G}(a_i)$ is about 1/2.
Intermediate values of $\beta_i$ allow for a smooth transition between the two strategic cases.

We note that for sufficiently small values of $\beta_i$, Eq. (5) can be approximated by the linear function

$$\mathcal{G}(a_i) \approx \frac{1}{2} \left\{ 1 + \frac{\beta_i}{2} \left[ a_i(\theta_i, f_i) - a_i(1 - \theta_i, f_i) \right] \right\} \tag{6}$$

i.e. agents pay attention to the difference between the two possible payoffs.
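The logit response of Eq. (5) and its small-β linearization, Eq. (6), can be compared numerically. This is a minimal sketch, not the authors' code: the names `logit_response` and `linear_response` are hypothetical, `a_same` stands for $a_i(\theta_i, f_i)$ and `a_other` for $a_i(1 - \theta_i, f_i)$. Eq. (5) is evaluated in the algebraically equivalent form $1/(1 + e^{-\beta \Delta a})$, which is only safe for moderate values of $\beta \Delta a$.

```python
import math


def logit_response(a_same, a_other, beta):
    """Eq. (5), rewritten as 1 / (1 + exp(-beta * (a_same - a_other)))."""
    return 1.0 / (1.0 + math.exp(-beta * (a_same - a_other)))


def linear_response(a_same, a_other, beta):
    """Eq. (6): first-order expansion of Eq. (5) for small beta."""
    return 0.5 * (1.0 + 0.5 * beta * (a_same - a_other))


# Payoffs of a cooperator (3) and a defector (5) facing only cooperators.
for beta in (0.01, 0.1, 1.0, 10.0):
    exact = logit_response(3.0, 5.0, beta)
    approx = linear_response(3.0, 5.0, beta)
    print(f"beta = {beta:5.2f}  logit = {exact:.6f}  linear = {approx:.6f}")
```

For small β both expressions agree closely, while for large β the logit response saturates and the linear form leaves the interval [0, 1], so Eq. (6) should only be used in the small-β regime.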

The situation becomes different if the agent is unable to calculate the expected payoff. In our
model, we assume that the agent then rather pays attention to the action of the majority and
tends to imitate this without knowing about the consequences. Thus, agent i only responds to
the information associated with the frequency, which shall be described by a logit function similar
to Eq. (5):

$$\mathcal{F}(f_{\theta_i}) = \frac{\exp\left[2 \beta_i \kappa_i(f_{\theta_i}) f_{\theta_i} - 1\right]}{\exp\left[2 \beta_i \kappa_i(f_{\theta_i}) f_{\theta_i} - 1\right] + \exp\left(-\left[2 \beta_i \kappa_i(f_{\theta_i}) f_{\theta_i} - 1\right]\right)} \tag{7}$$

$f_{\theta_i}$ describes the local frequency of agents playing strategy $\theta_i$ in the neighborhood of agent i, and
$f_{1-\theta_i} = 1 - f_{\theta_i}$ is the local frequency of agents playing the opposite strategy. Both frequencies
being equal, $\mathcal{F}(f_{\theta_i}) = \mathcal{F}(f_{1-\theta_i}) = 1/2$. Again, for sufficiently small $\beta_i$, from a linear approximation
of Eq. (7) we find

$$\mathcal{F}(f_{\theta_i}) \approx \beta_i\, \kappa_i(f_{\theta_i})\, f_{\theta_i}. \tag{8}$$

$\kappa_i(f_{\theta_i})$ is a nonlinear response function to consider a weighted influence of the frequency [18],
as we will investigate below. $\kappa_i(f_{\theta_i})$ may also depend on the time an agent has kept its current
strategy, or opinion [24], [25]. We emphasize that for the so-called linear voter model, $\kappa_i$ is
simply a constant $\kappa$ that does not depend on the frequency. So $\beta_i \kappa$ can be scaled to one, which
means that for the linear voter model we simply arrive at $\mathcal{F}(f_{\theta_i}) = f_{\theta_i}$. Thus, the response of
agent i is directly proportional to the local frequency of agents playing strategy $\theta_i$.

After having defined the agent's response to strategic information and to social herding, we use
the individual parameter $\zeta_i$ to weight these two different influences. Specifically, we define the
transition rate for agent i to switch from strategy $(1 - \theta_i)$ to the opposite strategy $\theta_i$ as follows:

$$w(\theta_i | (1 - \theta_i), f_i, \zeta_i) = (1 - \zeta_i)\, \mathcal{G}(a_i) + \zeta_i\, \mathcal{F}(f_{\theta_i}) \tag{9}$$

For $\zeta_i \to 0$, we cover the limit case of strategic interaction in the PD game; for $\zeta_i \to 1$, we arrive
at the limit case of pure social herding, i.e. imitation behavior without calculating the resulting
consequences.



2.2 Specifying the transition rates 

Before describing the system's dynamics by means of a master equation in the following section,
it will be handy to write down the transition rates of Eq. (9) more specifically. The transition
rates apply for a frequency-dependent process, i.e. they do not depend on the specific sequence of
interaction. In this paper, we fix the number of independent, but simultaneous 2-person games to
n = 4, which is convenient to compare random interactions with local ones on a regular lattice.
Hence, the relevant frequencies have only discrete values $f_i = k_i/n$ where $k_i = n_1 = 0, 1, 2, 3, 4$ is
the actual number of cooperating agents that agent i is interacting with. On the other hand, random
interactions can be approximated by the so-called mean-field approximation, where $f_i = f$, the
global fraction of cooperators.

Dropping the individual index i for the moment, we have to distinguish between two different
transition rates, $c_k(\zeta) = w(1|0, k, \zeta)$, i.e. the transition from defection to cooperation dependent
on k cooperating agents, and $d_k(\zeta) = w(0|1, k, \zeta)$, i.e. the transition from cooperation to defection
under the same conditions. Both of these rates are comprised of two parts, one resulting from
strategic behavior $(\hat{c}_k, \hat{d}_k)$, the other one resulting from social herding $(\check{c}_k, \check{d}_k)$,

$$c_k(\zeta) = (1 - \zeta)\, \hat{c}_k + \zeta\, \check{c}_k; \qquad d_k(\zeta) = (1 - \zeta)\, \hat{d}_k + \zeta\, \check{d}_k. \tag{10}$$

For the terms $(\check{c}_k, \check{d}_k)$ related to social herding, we use the linear approximation, Eq. (8), i.e. for
the specified neighborhood n = 4,

$$\check{c}_k = \frac{k}{4}\, \beta \kappa_k; \qquad \check{d}_k = \left(1 - \frac{k}{4}\right) \beta \kappa_k. \tag{11}$$

Again, for the linear voter model with $\kappa_k = \kappa$ and $\beta \kappa = 1$, the $(\check{c}_k, \check{d}_k)$ would simply result from
the set of values {0, 1/4, 2/4, 3/4, 1}. In order to use nonlinearities in the frequency response, we rather
prefer to specify the $(\check{c}_k, \check{d}_k)$ by discrete values $\alpha_0$, $\alpha_1$, $\alpha_2$ as shown in Table (12).



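$$
\begin{array}{c|cc|cc}
f = k/n & c_k & d_k & \check{c}_k & \check{d}_k \\ \hline
0 & c_0 & d_0 & \alpha_0 & 1 - \alpha_0 \\
1/4 & c_1 & d_1 & \alpha_1 & 1 - \alpha_1 \\
2/4 & c_2 & d_2 & \alpha_2 & \alpha_2 \\
3/4 & c_3 & d_3 & 1 - \alpha_1 & \alpha_1 \\
1 & c_4 & d_4 & 1 - \alpha_0 & \alpha_0
\end{array}
\tag{12}
$$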



The parameter $\alpha_0$ describes the transition of a cooperator (defector) towards defection
(cooperation) if surrounded by cooperators (defectors), solely based on social herding. Because agents
with such strategies to follow are absent in the neighborhood, $\alpha_0$ should consequently be zero, even
if there is a strong strategic incentive for a cooperator to switch towards defection if surrounded
by cooperators. Hence, considering only social herding, pure cooperation and pure defection are
"absorbing" states for the dynamics of the system. This can be avoided by choosing $\alpha_0 = \epsilon$,
a very small value that allows for occasional random changes of the strategies [18], but in this
paper we choose $\alpha_0 = 0$.

Possible combinations of $(\alpha_1, \alpha_2)$ define a parameter space to distinguish between different forms
of social herding, as shown in Fig. 1 (left). Positive frequency dependence (pf) means that the
probability to change to the opposite strategy monotonically increases with the frequency of that
strategy in the neighborhood, also known as "majority voting". Negative frequency dependence
(nf) means the opposite, i.e. the probability monotonically decreases with the frequency, also
known as "minority voting". On the other hand, (pa) and (na) define parameter regions with
non-monotonic dependence. For example, (pa) means an increase of the probability as long
as the opposite strategy is not the majority, also known as voting against the trend, while
(na) describes constellations with a strong amplification of minority strategies. We note that
the so-called "voter point" that represents the linear voter model, where $\alpha_1 = 1/4$ and
$\alpha_2 = 2\alpha_1 = 1/2$ are strictly proportional to k, is on the border between the (pf) and (pa)
parameter regions. For our investigations, we will consider a scenario where the nonlinearity is
only represented by $\alpha_2$, whereas $\alpha_1$ is chosen according to the linear voter model. Four possible
cases which refer to the (pf), (pa), (na) and the linear voter model are shown in Fig. 1 (right).

Figure 1: (left) Parameter space $(\alpha_1, \alpha_2)$ to define the nonlinearity in social herding (see also
Table 12). The different regions are explained in the text. We use the (pa) region, defined by
Eq. (18). (right) Linear voter model (red line) and deviations controlled by $\alpha_2$ at f = 0.5.



It remains to specify the payoff-related terms $(\hat{c}_k, \hat{d}_k)$, which follow directly from Eq. (5). Here,
we assume the deterministic limit $\beta \to \infty$, for which we get $\mathcal{G}(a_i) = \Theta[a_i(\theta_i, f_i) - a_i(1 - \theta_i, f_i)]$,
where $\Theta[y]$ is the Heaviside function, which is one if y > 0 and zero otherwise. I.e. $\mathcal{G}(a_i)$ is
either one or zero dependent on whether the payoff for the changed strategy is larger or less
than the payoff resulting from the current strategy. Taking into account the payoff relations,
Eq. (1), we verify that the expected payoffs, Eq. (4), for defectors, a(0, f), are always larger
than the corresponding ones for cooperators, a(1, f), regardless of the fraction of cooperators in
the neighborhood. I.e. in non-repeated games as considered here, defection is an evolutionarily
stable strategy. Hence, in the deterministic limit of strategic interaction, we always have $\hat{c}_k = 0$
and $\hat{d}_k = 1$. This can rightly be regarded as the worst-case scenario because, considering only
a strategic point of view, the system will always end up in pure defection. The most important
task is to identify conditions where additional social herding not only avoids this
trap, but also lets the dynamics converge to pure cooperation.
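To make the combination of both influences concrete, the following sketch evaluates the combined rates of Eq. (10) for n = 4, taking the herding parts from Table (12) and the strategic parts in the deterministic limit (zero for the switch to cooperation, one for the switch to defection). It is an illustration only; the function names `herding_rates` and `combined_rates` are hypothetical.

```python
def herding_rates(alpha1, alpha2, alpha0=0.0):
    """Herding parts of Table (12), indexed by k = 0..4 cooperating neighbors."""
    c_herd = [alpha0, alpha1, alpha2, 1 - alpha1, 1 - alpha0]  # D -> C
    d_herd = [1 - alpha0, 1 - alpha1, alpha2, alpha1, alpha0]  # C -> D
    return c_herd, d_herd


def combined_rates(zeta, alpha1, alpha2, alpha0=0.0):
    """Eq. (10) with the deterministic strategic parts (0 for D -> C, 1 for C -> D)."""
    c_strategic, d_strategic = 0.0, 1.0
    c_herd, d_herd = herding_rates(alpha1, alpha2, alpha0)
    c = [(1 - zeta) * c_strategic + zeta * ch for ch in c_herd]
    d = [(1 - zeta) * d_strategic + zeta * dh for dh in d_herd]
    return c, d


c, d = combined_rates(zeta=0.95, alpha1=0.25, alpha2=0.7)
for k in range(5):
    print(f"k = {k}  c_k = {c[k]:.4f}  d_k = {d[k]:.4f}")
```

With ζ = 0 the rates reduce to pure defection (all c_k = 0, all d_k = 1), and with ζ = 1 and the voter point (α_1, α_2) = (1/4, 1/2) they reduce to the linear voter model.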

The observant reader may have noticed that we have interpreted $\beta_i$ differently for social herding
(where we assumed that it is just small) and for strategic interaction (where $\beta \to \infty$ was assumed).
This is not a contradiction. In fact, $\beta$ quantifies the randomness in following the different
information, and we can assume that the payoff-related attention is much higher and less prone to
errors than the response to the behavior of neighbors. In general, we may distinguish between $\hat{\beta}_i$
and $\check{\beta}_i$ for the different responses, but this is not applied here.



2.3 Dynamics to change the strategy 

In the previous section, we have defined the "rules" for agents to change their strategy dependent 
on both strategic information and social herding. Most agent-based models, at this point, would 
continue with extensive computer simulations to probe the parameter space for some non-trivial 
results. We will certainly follow with computer simulations as well, however we are also interested 
in some analytical insights into the model which would allow us to predict the system's dynamics 
without testing every possible parameter combination. For this reason, we need to specify the 
dynamics of agents in a more formal way, on two different levels, (a) on the micro level of the 
individual agent, and (b) on the macro level, describing the fraction of cooperators in the system. 

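For the micro level, we use a stochastic approach, i.e. we deal with the probability $p_i(\theta_i, t)$ that
agent i uses strategy $\theta_i$ at time t. As explained before, this probability depends on the strategies
of agents in the neighborhood of agent i, expressed by the vector $\underline{\theta}_i = \{\theta_{i_1}, \theta_{i_2}, \ldots, \theta_{i_n}\}$. Hence,
$p_i(\theta_i, t)$ is defined as the marginal distribution:

$$p_i(\theta_i, t) = \sum_{\underline{\theta}_i'} p(\theta_i, \underline{\theta}_i', t). \tag{13}$$

The summation is over all possible distributions $\underline{\theta}_i'$. Specific realizations of these distributions
shall be denoted as $\underline{\sigma}$. For n = 4, there are $2^n$ possible realizations. For the time-dependent
change of $p_i(\theta_i, t)$ we assume the following master equation:

$$\frac{d}{dt}\, p_i(\theta_i, t) = \sum_{\underline{\theta}_i'} \left[ w(\theta_i | (1 - \theta_i), \underline{\theta}_i')\, p(1 - \theta_i, \underline{\theta}_i', t) - w(1 - \theta_i | \theta_i, \underline{\theta}_i')\, p(\theta_i, \underline{\theta}_i', t) \right] \tag{14}$$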

This equation considers all possible processes that may lead to an increase or decrease in the
probability that agent i uses strategy $\theta_i$ given the neighborhood distribution $\underline{\theta}_i'$, with the transition
rates $w(\theta_i | (1 - \theta_i), \underline{\theta}_i')$, $w(1 - \theta_i | \theta_i, \underline{\theta}_i')$. Note that these are not the transition rates defined in Eq.
(9), which only depend on the local frequency $f_i$, but not on the neighborhood distribution $\underline{\theta}_i'$.

In order to map the two, we have to consider how many specific realizations of the distribution
$\underline{\theta}_i'$ may lead to the same $f_i$. Taking the example $\underline{\sigma} = \{0010\}$, there are exactly $\binom{n}{1}$ different
possibilities to realize $f_i = 1/n$. Hence, transforming the master Eq. (14) that depends on the
neighborhood distribution $\underline{\theta}_i'$ into one that only contains the respective local frequency $f_i$ results
in a combinatorial prefactor of $\binom{n}{k}$. Using again the specific notations $c_k$, $d_k$, Eq. (10), for the
transition rates, we can rewrite the master equation (14) now as

$$\frac{d}{dt}\, p_i(1, \zeta, t) = \sum_{k=0}^{n} \binom{n}{k} \left[ c_k(\zeta)\, p(0, k/n, \zeta, t) - d_k(\zeta)\, p(1, k/n, \zeta, t) \right] \tag{15}$$



The corresponding master equation for $p_i(0, \zeta, t) = 1 - p_i(1, \zeta, t)$ follows likewise. Note that in
Eq. (15) we have chosen the individual parameter $\zeta_i$ to be a constant $\zeta$. I.e. whereas the local
frequency $f_i = k/n$ changes over time because of concurrent decisions of neighboring agents
about their strategies, $\zeta$ is, in this paper, assumed to be a global control parameter, the impact
of which will be discussed together with the computer simulations.
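The combinatorial prefactor can be verified by brute force. The short sketch below enumerates all $2^n$ neighborhood realizations for n = 4 and counts how many of them contain exactly k cooperators; the count should coincide with the binomial coefficient. Variable names are chosen for illustration only.

```python
from collections import Counter
from itertools import product
from math import comb

n = 4

# Enumerate all 2**n neighborhood realizations (0 = defector, 1 = cooperator)
# and group them by their number of cooperators k.
counts = Counter(sum(realization) for realization in product((0, 1), repeat=n))

for k in range(n + 1):
    print(f"k = {k}: {counts[k]} realizations, binomial coefficient = {comb(n, k)}")
```

For example, the realization {0010} is one of the four realizations with k = 1, i.e. with local frequency 1/n.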

With this, we have a bottom-up description of the system's dynamics given by N stochastic
equations, Eq. (15), which are coupled because of the overlapping neighborhoods of agents,
expressed in terms of $f_i$. On the other hand, on the macroscopic level we have to deal with the
probability $P(f, \zeta, t)$ to find a given fraction of cooperators, f, at time t, assuming the social
herding factor $\zeta$. The dynamics can again be specified by a stochastic equation:

$$\frac{d}{dt}\, P(f, \zeta, t) = \sum_{f'} \left[ W(f | f', \zeta)\, P(f', \zeta, t) - W(f' | f, \zeta)\, P(f, \zeta, t) \right] \tag{16}$$



$f'$ denotes all possible deviations from a given value f that can be reached during one time step
by means of the transition rates $W(f' | f, \zeta)$. These are not identical with the individual transition
rates, Eq. (9), but aggregated rates that take into account all possible ways to change f. The
smallest change of $f = N_1/N$, Eq. (2), is the addition or subtraction of a single cooperator,
i.e. $f' \in \{(N_1 + 1)/N; (N_1 - 1)/N\}$. The individual equivalent for such processes is given by
Eq. (10), where the terms $c_k(\zeta)$ describe the transition of a single defector into a cooperator, and
the $d_k(\zeta)$ the opposite transition. Hence, we find for the aggregated transition rates

$$
\begin{aligned}
W(f + 1/N | f, \zeta) &= W^{+}(f, \zeta) = \sum_{k=0}^{n} \binom{n}{k} f^{k} (1 - f)^{n-k}\, c_k(\zeta) \\
W(f - 1/N | f, \zeta) &= W^{-}(f, \zeta) = \sum_{k=0}^{n} \binom{n}{k} f^{n-k} (1 - f)^{k}\, d_{n-k}(\zeta).
\end{aligned}
\tag{17}
$$



The combinatorial prefactors preceding the $c_k(\zeta)$ and $d_k(\zeta)$ result from the various ways to
choose agents with n = 4 neighbors, k of which could be cooperators given the global fraction
of cooperators f. Here, we have used the so-called mean-field assumption that replaces the
frequencies $f_i$ of the individual neighborhoods by the global value f. With the specific values for
$c_k(\zeta)$ and $d_k(\zeta)$ given by Eqs. (10) and (12), the dynamics on the systemic level is also completely
specified. In the following, we will use the dynamics on the micro level for carrying out computer
simulations, while the dynamics on the macro level will be used for analytical investigations.
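The aggregated rates of Eq. (17) are binomial averages of the individual rates and are easy to evaluate numerically. The following minimal sketch assumes n = 4, the herding values of Table (12) with $\alpha_0 = 0$, and the deterministic strategic parts (zero for the switch to cooperation, one for the switch to defection); the function names `rates`, `w_plus` and `w_minus` are hypothetical.

```python
from math import comb

N_NEIGHBORS = 4


def rates(zeta, alpha1, alpha2):
    """Combined rates c_k, d_k for k = 0..4, Eq. (10) with Table (12), alpha0 = 0."""
    c_herd = [0.0, alpha1, alpha2, 1 - alpha1, 1.0]
    d_herd = [1.0, 1 - alpha1, alpha2, alpha1, 0.0]
    c = [zeta * x for x in c_herd]               # strategic part is 0
    d = [(1 - zeta) + zeta * x for x in d_herd]  # strategic part is 1
    return c, d


def w_plus(f, c, n=N_NEIGHBORS):
    """First line of Eq. (17): aggregated rate for gaining one cooperator."""
    return sum(comb(n, k) * f**k * (1 - f) ** (n - k) * c[k] for k in range(n + 1))


def w_minus(f, d, n=N_NEIGHBORS):
    """Second line of Eq. (17): aggregated rate for losing one cooperator."""
    return sum(comb(n, k) * f ** (n - k) * (1 - f) ** k * d[n - k] for k in range(n + 1))


c, d = rates(zeta=0.95, alpha1=0.25, alpha2=0.7)
for f in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"f = {f:.1f}  W+ = {w_plus(f, c):.4f}  W- = {w_minus(f, d):.4f}")
```

Two limiting cases serve as a consistency check: for ζ = 0 one obtains W+ = 0 and W- = 1 for every f, and for ζ = 1 at the voter point (α_1, α_2) = (1/4, 1/2) the sums reduce to W+ = f and W- = 1 - f.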



3 Results of Computer Simulations 



We now use the dynamics specified in Eq. (15) to run agent-based computer simulations for
different sets of parameters. According to Eqs. (10), (12), we only need to vary the weight
$0 \le \zeta \le 1$ and the parameters $0 \le (\alpha_1, \alpha_2) \le 1$ assigned to the social herding of the agents.
Regarding their strategic decision, everything is already defined, and with $\hat{c}_k = 0$, $\hat{d}_k = 1$
defection remains the only choice. This "worst-case scenario" can only be changed by
a considerable amount of social herding, in which agents copy the strategy of their neighbors
regardless of the payoff assigned to it. This is shown in Fig. 2. Below a critical level for social
herding, $\zeta \approx 0.7$, only defection remains. For $\zeta > 0.7$ we observe different levels of cooperation
which depend on the combination of $\zeta$ and $\alpha_2$. If $\zeta > 0.8$, cooperation even becomes the majority,
i.e. f > 0.5, but only for large values of $\zeta$ and $\alpha_2$ is full cooperation, $f \to 1$, reached. This issue
is further investigated below.
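A simulation of this kind can be sketched in a few lines. The code below is not the authors' implementation but a minimal illustration under several assumptions that the text does not specify: the rates $c_k(\zeta)$, $d_k(\zeta)$ are used as switching probabilities per update, agents are updated one at a time in random order (one sweep updates every agent once), and the lattice has periodic boundaries. The function name `simulate` and all of its parameters are hypothetical.

```python
import random


def simulate(L=20, zeta=0.95, alpha1=0.25, alpha2=0.7, sweeps=100,
             init_coop=0.5, seed=0):
    """Return the final fraction of cooperators on an L x L lattice (N = L*L)."""
    rng = random.Random(seed)
    # Herding parts from Table (12) with alpha0 = 0; strategic parts are 0 and 1.
    c_herd = [0.0, alpha1, alpha2, 1 - alpha1, 1.0]
    d_herd = [1.0, 1 - alpha1, alpha2, alpha1, 0.0]
    c = [zeta * x for x in c_herd]               # defector -> cooperator
    d = [(1 - zeta) + zeta * x for x in d_herd]  # cooperator -> defector

    grid = [[1 if rng.random() < init_coop else 0 for _ in range(L)]
            for _ in range(L)]
    sites = [(x, y) for x in range(L) for y in range(L)]

    for _ in range(sweeps):
        rng.shuffle(sites)  # assumption: random sequential updating
        for x, y in sites:
            # Number of cooperators among the four von Neumann neighbors.
            k = (grid[(x - 1) % L][y] + grid[(x + 1) % L][y]
                 + grid[x][(y - 1) % L] + grid[x][(y + 1) % L])
            if grid[x][y] == 0:
                if rng.random() < c[k]:
                    grid[x][y] = 1
            elif rng.random() < d[k]:
                grid[x][y] = 0

    return sum(map(sum, grid)) / (L * L)


for zeta in (0.5, 0.8, 0.95):
    print(f"zeta = {zeta:.2f}  final fraction of cooperators = {simulate(zeta=zeta):.3f}")
```

With L = 20 the system size matches N = 400. Because the update scheme and boundary conditions are assumptions, the numerical values produced by this sketch need not coincide with those reported in the figures.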



Figure 2: Global fraction of cooperation f dependent on the level of social herding $\zeta$. $\alpha_1$
is fixed, $\alpha_2$ varies between 0.4 and 1.0 according to the color scale. System size N=400.

The role of the nonlinearity in social herding, expressed in terms of $\alpha_1$, $\alpha_2$, is further investigated
in Fig. 3, given a supercritical level of social herding. We see that there is an optimal nonlinearity
to enhance cooperation, i.e. $\alpha_1$, $\alpha_2$ have to be chosen such that they belong to the area of "positive



Allee" (pa) effects. This area is defined by the inequalities (see also Eq. (12))

$$0 \le \alpha_1 \le \alpha_2; \qquad (1 - \alpha_1) \le \alpha_2 \le 1. \tag{18}$$



It describes a response where the transition toward a given strategy increases with the frequency
of that strategy as long as that strategy is not the majority, i.e. minority strategies are favored. A
special case where $\alpha_1$ is taken from the linear voter model, whereas $\alpha_2$ is larger than 0.5, is shown
in Fig. 1 (right). We note in particular that social herding according to the linear voter model will
not allow the transition toward cooperation, which will be further substantiated by analytical
results in the next section. Further, all forms of the transition rates that monotonically increase
with the frequency, indicated by the (pf) area, will not lead to cooperation. Social herding in
this case only amplifies defection.




Figure 3: Fraction of cooperation (color scale) dependent on the nonlinearities in social herding,
defined by $\alpha_1$, $\alpha_2$. Fixed level of social herding $\zeta$=0.95. The four different areas are
defined in Fig. 1 (left). • indicates the linear voter model. System size N=400.



Assuming the right choice of parameters for the transition to cooperation, we can now look
at how the dynamics evolve in space. We have chosen a two-dimensional regular lattice with
von Neumann neighborhood, where each agent interacts with n = 4 local neighbors. Initially,
we assume a small cluster of cooperating agents. Without social herding, this cluster would
immediately disappear in the next time step because all agents will choose defection, which is
the rational choice to maximize their payoff. We observe instead a spreading of cooperation, i.e.
an invasion of the cooperating strategy into the domain of defectors. The cooperating agents,
however, do not form compact clusters. A minority fraction of defectors will always survive, and
their spatial distribution in small clusters across the domain of cooperators continues to change
in time. I.e. we never reach a stationary state in space, even though the global fraction of both
strategies, on average, reaches an equilibrium.

We further note that there is a critical size for the initial cluster of cooperators to grow. This has 
been already discussed in detail for pure PD games on a regular lattice [H [20], and in opinion 
dynamics models [29]. Now, the addition of supercritical social herding of course reduces these
requirements. It is worth mentioning that, starting from random initial conditions in a spatially
extended system, we find that a vanishingly small initial density of cooperators is enough to
trigger the final state. The reason is that, if the system is large enough,
one cluster of cooperators larger than the critical size will appear by chance. This cluster will
be sufficient to trigger the outbreak of cooperation. Here, however, we will not pursue
this discussion further. Instead, the initial conditions and parameter constellations for the outbreak of
cooperation will be further discussed for the mean-field case, in the next section.




Figure 4: Snapshots of the transition toward cooperation at times $t=0, 10, 20, 50, 150, 500$.
$N = 10^4$ agents are placed on a regular lattice and interact each with their $n=4$ spatial
neighbors. Dark color (blue) indicates cooperators, light color (yellow) defectors. Parameters
$\alpha_1 = 0.25$, $\alpha_2 = 0.7$, $\zeta = 0.95$. System is a two-dimensional regular lattice with von Neumann
neighborhood and size $N=400$.



4 Mean-field investigations 
4.1 Calculating the effort 

We verified by means of computer simulations that social herding can indeed be utilized
to boost cooperation. We now illustrate this finding by some analytical considerations. As a
first step, we want to calculate the "effort" to transfer the system into a majority of cooperators.
Considering only the strategic dimension, this effort should be very high because there is a
strong incentive to defect. On the other hand, social herding may help in this situation because
it neglects the payoff differences. It is therefore particularly important in the first stage of the
phase transition.



A formal approach to calculate the effort starts from the master equation (16) on the systemic
level, in the mean-field limit. The detailed balance condition, which is a specific form of the
equilibrium condition $dP(f,t)/dt = 0$, requires that the net probability fluxes are balanced, i.e.

$$W(f \mid f - 1/N, \zeta)\, P^0(f - 1/N, \zeta) = W(f - 1/N \mid f, \zeta)\, P^0(f, \zeta), \tag{19}$$

where $P^0(f,\zeta)$ denotes the equilibrium probability distribution, which is independent of $t$. This
equation is recursive and, using $f = N_1/N$, Eq. (19) can be re-formulated as:

$$P^0(f,\zeta) = P^0(0,\zeta) \prod_{i=1}^{N_1} \frac{W\left(\frac{i}{N} \,\middle|\, \frac{i-1}{N}, \zeta\right)}{W\left(\frac{i-1}{N} \,\middle|\, \frac{i}{N}, \zeta\right)}. \tag{20}$$

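The normalization $P^0(0,\zeta)$ can be found by enforcing $\sum_{i=0}^{N} P^0(i/N, \zeta) = 1$, and the transition
rates are given by Eq. (17). We visualize the equilibrium probability distribution by means of a
potential $\Omega(f,\zeta)$ that has its minimum where $P^0(f,\zeta)$ has its maximum, i.e. it represents the
"effort" of reaching a given equilibrium state,

$$P^0(f,\zeta) = \exp\{-\Omega(f,\zeta)\}, \tag{21}$$

where $\Omega$ is given by

$$\Omega(f,\zeta) = -\ln P^0(0,\zeta) - \sum_{i=1}^{N_1} \ln\left[\frac{W\left(\frac{i}{N} \,\middle|\, \frac{i-1}{N}, \zeta\right)}{W\left(\frac{i-1}{N} \,\middle|\, \frac{i}{N}, \zeta\right)}\right]. \tag{22}$$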
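
To make the recursion behind Eqs. (20)-(22) concrete, the following sketch evaluates $P^0$ and
$\Omega$ for an arbitrary one-step process. The rates passed in are placeholders: the usage example
below relies on toy rates $N-i$ and $i$, for which the equilibrium is binomial, and not on the
rates of Eq. (17). The function name `effort_potential` is hypothetical.

```python
import math

def effort_potential(w_up, w_down, N):
    """Equilibrium distribution P0 and potential Omega = -ln P0 on the grid f = i/N.

    w_up(i):   rate for i -> i+1 cooperators (must be > 0 for i = 0..N-1)
    w_down(i): rate for i -> i-1 cooperators (must be > 0 for i = 1..N)
    """
    # Logarithm of the product in Eq. (20), accumulated term by term
    log_ratio = [0.0]
    for j in range(1, N + 1):
        log_ratio.append(log_ratio[-1] + math.log(w_up(j - 1)) - math.log(w_down(j)))
    # Normalization: the probabilities over all i = 0..N must sum to one
    shift = max(log_ratio)
    log_norm = shift + math.log(sum(math.exp(x - shift) for x in log_ratio))
    P0 = [math.exp(x - log_norm) for x in log_ratio]
    Omega = [log_norm - x for x in log_ratio]
    return P0, Omega
```

With the toy rates `w_up = lambda i: N - i` and `w_down = lambda i: i`, the function returns the
binomial distribution with success probability 1/2, and $\Omega$ has its minimum at $f = 1/2$.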



Figure 5 shows the effort $\Omega(f,\zeta)$ as a function of the global fraction of cooperators $f$ and the level
of social herding $\zeta$, which acts as a control parameter. We observe that for very low values of $\zeta$ the
effort is a monotonically increasing function of the frequency $f$. Given a fraction of cooperators,
$f = 0.2$, and small $\zeta$, it becomes more and more difficult, or unlikely, to find a larger fraction of
cooperators (red line). Considering instead a high level of social herding, e.g. $\zeta$ about 0.85, there
is a monotonic decrease of the effort with an increasing fraction of cooperators. That is, starting
from a supercritical level of social herding, the outbreak and the increase of cooperation become
very likely (green line).

The observant reader will notice in Figure 5, for large $\zeta$, the nonmonotonic dependence of the
effort on the fraction of cooperators. That is, there is a critical region around $f \approx 0.2$ below which
defection becomes the most probable state. This relates to the critical cluster size of cooperators
in Fig. 4 needed to allow the transition toward cooperation. However, there is a noticeable difference
underlying both results. Fig. 5 is based on the mean-field limit, i.e. there is no spatial correlation
between interacting agents, whereas Fig. 4 assumes a spatial neighborhood defined by the regular
lattice. In fact, it is known that spatial interaction enhances cooperation [17, 20, 22]. Already
small, randomly formed clusters of cooperators are sufficient for the outbreak of cooperation,
whereas random interaction results in a much larger threshold.

Figure 5: Effort $\Omega(f,\zeta)$, Eq. (22), dependent on the global fraction of cooperators $f$ and the
level of social herding $\zeta$. The nonlinearity is specified by $\alpha_1=0.25$, $\alpha_2=0.85$.



4.2 Competition dynamics 

Finally, we can also derive a deterministic dynamics for the global fraction of cooperators,
$f(t)$, in the mean-field limit. There are two ways of deriving this. One starts from the
stochastic dynamics on the microscopic level, $p_i(\theta_i,t)$, Eq. (14), and is discussed in detail in [18].
The other one starts from the stochastic dynamics on the macroscopic level, $P(f,\zeta,t)$, Eq. (16).
The expected value for the global fraction of cooperators then follows from

$$\langle f(\zeta,t)\rangle = \sum_{f'} f'\, P(f',\zeta,t), \tag{23}$$

where $f'$ denotes all possible realizations of $f$. Using the master equation (16), we arrive at the
deterministic dynamics

$$\frac{d\langle f(\zeta,t)\rangle}{dt} = W_+(f,\zeta)\,(1-\langle f\rangle) - W_-(f,\zeta)\,\langle f\rangle, \tag{24}$$

where the aggregated transition rates $W_+(f,\zeta)$, $W_-(f,\zeta)$ are given by Eq. (17). Assuming a
narrow probability distribution in equilibrium, $P^0(f,\zeta)$, the expected value $\langle f^0(\zeta)\rangle$ can be
approximated by the maxima of $P^0(f,\zeta)$. In particular, the deterministic dynamics will converge to
those areas where $P^0(f,\zeta)$ is largest, or where $\Omega(f,\zeta)$ has its minima, shown in Fig. 5. While we
do not argue about the specific global dynamics at intermediate times (which can be governed by
stochastic influences, in particular in early stages), we can see the late stage of the dynamics as a
"quasi-stationary" motion along the valley in the potential landscape shown in Fig. 5, provided
$\zeta$ is chosen large enough.



We can rewrite Eq. (24), which basically describes the "replication" of cooperators at the global
scale, to make it resemble the known replicator equation,

$$\frac{d\langle f(\zeta,t)\rangle}{dt} = \langle f\rangle\,(1-\langle f\rangle)\,\left[E_1(f,\zeta) - E_0(f,\zeta)\right]. \tag{25}$$

The two terms $E_1$ and $E_0$ are the fitness values associated with the two different strategies. The
fraction of cooperation will grow if the fitness of cooperation $E_1(f,\zeta)$ is larger than the fitness of
defection $E_0(f,\zeta)$, which both depend on the global level of cooperation and the level of social
herding,

$$E_1(f,\zeta) = \frac{W_+(f,\zeta)}{f}\,; \qquad E_0(f,\zeta) = \frac{W_-(f,\zeta)}{1-f}. \tag{26}$$

To evaluate the fitness values, one should note the strictly nonlinear dependence of the transition
rates on $f$, Eq. (17). Fig. 6 shows the difference $E_1 - E_0$ on the whole range of $f$ and $\zeta$. We emphasize
that this graph holds for fixed values of the nonlinearity parameters $\alpha_1$, $\alpha_2$, i.e. it adds another
dimension to Fig. 3, which was obtained for a fixed herding level $\zeta$. Fig. 6 also clearly shows
the influence of the initial fraction of cooperators, $f(0)$, for the mean-field case. Assuming e.g.
a fixed value of $\zeta=0.85$, we see that the fraction of cooperators $f(t)$ can increase in time
only if $f(0)$ is between 0.15 and 0.6. While the lower bound has an intuitive meaning as the
minimum threshold to start cooperation, the upper bound is less obvious. It results from
the influence of the nonlinear social herding, which does not simply support cooperation if that
is the strategy of the majority. We recall that social herding does not assume any "value" related
to the strategies. Hence, for the example considered, the maximum fraction of cooperators is
given by $f = 0.6$. A higher level of social herding, or different values for the nonlinearities, may
increase this fraction up to about one, i.e. full cooperation.
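
The window of initial conditions can be checked numerically under one additional assumption:
that the mean-field dynamics takes the closed form of Eq. (29), which holds for $\alpha_1 = 1/4$.
The sketch below scans $f$ and reports where $d\langle f\rangle/dt > 0$; the helper names are
hypothetical.

```python
import numpy as np

def growth_rate(f, zeta, alpha2):
    """Right-hand side of the mean-field dynamics, assuming the form of Eq. (29)."""
    return f * (zeta * (1.0 + 3.0 * f * (1.0 - f) ** 2 * (2.0 * alpha2 - 1.0)) - 1.0)

def growth_interval(zeta, alpha2, n=100001):
    """Return (f_low, f_high) of the region where cooperation grows, or None if empty."""
    f = np.linspace(0.0, 1.0, n)
    growing = f[growth_rate(f, zeta, alpha2) > 0.0]
    if growing.size == 0:
        return None
    # f (1 - f)^2 has a single maximum on [0, 1], so the growing region is one interval
    return float(growing.min()), float(growing.max())
```

For $\zeta = 0.85$, $\alpha_2 = 0.8$ the scan returns a window whose bounds lie close to the values
0.15 and 0.6 read off Fig. 6, while for a low herding level such as $\zeta = 0.5$ no such window
exists.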



Another way of expressing the dynamics of Eq. (25) is through

$$\frac{d\langle f(\zeta,t)\rangle}{dt} = \langle f(\zeta,t)\rangle\,\left(E_1 - \langle E\rangle\right)\,; \qquad \langle E\rangle = \sum_{\sigma} E_\sigma \langle f_\sigma\rangle = E_1 \langle f\rangle + E_0\,(1 - \langle f\rangle). \tag{27}$$

As long as $E_1$ is larger than the average fitness, $\langle E\rangle$, the fraction of cooperators in the system
is able to grow, but one has to recognize that, because of the time dependence of $\langle f(t)\rangle$ and its
implicit feedback on $E_\sigma$, $\langle E(t)\rangle$ evolves over time as well. Hence, Eq. (27) describes a nonlinear
selection process for each of the strategies, dependent on the parameters describing strategic
interaction and social herding.
Figure 6: Difference of the fitness values $E_1(f,\zeta) - E_0(f,\zeta)$ dependent on the fraction of
cooperators, $f \in [0.02,0.99]$, and the level of social herding, $\zeta \in [0.05,0.99]$. Nonlinearity
parameters: $\alpha_1 = 0.25$, $\alpha_2 = 0.8$.



For some special cases, we are able to derive closed-form solutions of the competition dynamics
expressed by Eqs. (24)-(27). In the absence of any social herding, $\zeta = 0$, we only have to take into
account the transition rates from strategic interaction, which are $c_k = 0$, $d_k = 1$. This results in
$E_1(f,\zeta=0) = 0$ and $E_0(f,\zeta=0) = 1$, i.e. the dynamics reads $\langle f(t)\rangle = f(0)\exp\{-t\}$, which means
that cooperation dies out exponentially. In the opposite case, $\zeta = 1$, i.e. absence of any strategic
interaction, Eq. (25) can be solved for the case of the linear voter model, which implies $c_k = k/4$ and
$d_k = 1 - (k/4)$, Eq. (11). We then find $E_1(f,\zeta=1) = E_0(f,\zeta=1)$, i.e. the fitness of both strategies,
which are actually mere labels without any payoff assigned, is the same. This results in the
dynamics $\langle f(t)\rangle = f(0)$, i.e. a conservation of the initial fraction of cooperators, on average. This
is known as one of the puzzles associated with the linear voter model: individual realizations
of the dynamics, e.g. using stochastic simulations, always lead to convergence with $f \to 0$ or
$f \to 1$, but averaging over many runs reveals that the frequency at which cooperators or defectors
dominate is equal to their respective initial fractions, $f(0)$ and $1 - f(0)$.

These two limiting cases allow us to position the dynamics for $0 < \zeta < 1$, i.e. the influence of both
strategic interaction and social herding at the same time. For social herding, let us first assume
the case of the linear voter model as described above. We can then verify that the closed solution
for the dynamics of cooperators is given as:

$$\langle f(\zeta,t)\rangle = f(0)\exp\{(\zeta-1)\,t\}, \tag{28}$$

which is similar to the case of only strategic interaction, except that the extinction rate is reduced
by the factor $(1-\zeta)$, i.e. the time scale for the extinction of cooperators is stretched by
$1/(1-\zeta)$. This is an important result because it
demonstrates that linear social herding will not prevent the extinction of cooperation, not even
for large $\zeta$. Hence, in order to turn defection into cooperation, we essentially need a high level of
nonlinear social herding, i.e. the right $\zeta$ and $\alpha_2$ values.
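
Eq. (28) can be verified numerically by integrating Eq. (24). The sketch assumes mean-field rates
in which a herding agent cooperates with probability $f$ (linear voter model) and a strategic
agent never cooperates, so that $W_+ = \zeta f$ and $W_- = 1 - W_+$. These explicit forms are an
assumption made here for illustration, since Eq. (17) is not restated in this section, and the
function names are hypothetical.

```python
import math

def rhs(f, zeta):
    """Right-hand side of Eq. (24) with assumed linear-voter mean-field rates."""
    w_plus = zeta * f          # assumed: herders cooperate with probability f
    w_minus = 1.0 - w_plus     # assumed: any other choice is defection
    return w_plus * (1.0 - f) - w_minus * f

def integrate(f0, zeta, t_end, dt=1e-3):
    """Integrate df/dt = rhs(f) from t = 0 to t_end with a classical Runge-Kutta scheme."""
    f = f0
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(f, zeta)
        k2 = rhs(f + 0.5 * dt * k1, zeta)
        k3 = rhs(f + 0.5 * dt * k2, zeta)
        k4 = rhs(f + dt * k3, zeta)
        f += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return f

def closed_form(f0, zeta, t):
    """Closed solution of Eq. (28)."""
    return f0 * math.exp((zeta - 1.0) * t)
```

Comparing `integrate(0.5, 0.9, 10.0)` with `closed_form(0.5, 0.9, 10.0)` shows that the two agree
to numerical precision; for $\zeta = 1$ the integrated fraction stays at its initial value.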



Considering a nonlinearity where $\alpha_1 = 1/4$ but $\alpha_2 \neq 2/4$, we find from Eq. (24)

$$\frac{d\langle f(\zeta,t)\rangle}{dt} = \langle f\rangle\left\{\zeta\left[1 + 3\langle f\rangle(1-\langle f\rangle)^2\,(2\alpha_2 - 1)\right] - 1\right\}. \tag{29}$$

For $\alpha_2 = 2/4$, the solution reduces to Eq. (28), whereas for $\zeta = 1$ we arrive at the mean-field
equation for the nonlinear voter model only [18]. In order to make cooperation, $\langle f\rangle = 1$, a stable
fixed point for the full dynamics, the following condition for $\alpha_2$ has to be met:



$$\frac{1}{2} + \frac{1-\zeta}{6\zeta\,\langle f\rangle(1-\langle f\rangle)^2} < \alpha_2 < 1, \tag{30}$$

which implies $1/[1 + 3\langle f\rangle(1-\langle f\rangle)^2] < \zeta < 1$. This inequality can only be met for a considerably
high level of social herding. The feasible range of $(f,\zeta)$ values that is consistent with a given
value of $\alpha_2$, e.g. $\alpha_2=0.8$, is shown in Fig. [HJ. The maximum range resulting from $\alpha_2 = 1$ is also
shown in the same Figure by the dashed line. We note again that, even if Eq. (30) is fulfilled,
the dynamics does not necessarily converge to $f \to 1$. Dependent on the parameters $\{\zeta, \alpha_1, \alpha_2\}$,
lower equilibrium fractions of cooperators may also be reached, i.e. we find a coexistence of
cooperation and defection.
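
A short numerical check of Eqs. (29) and (30), offered as a sketch with hypothetical function
names: it evaluates both bounds and relaxes Eq. (29) to its stationary state with a simple Euler
scheme (step size and number of steps are arbitrary choices).

```python
def alpha2_lower_bound(f, zeta):
    """Lower bound on alpha_2 from Eq. (30) for a given fraction f (0 < f < 1)."""
    return 0.5 + (1.0 - zeta) / (6.0 * zeta * f * (1.0 - f) ** 2)

def zeta_lower_bound(f):
    """Lower bound on zeta implied by Eq. (30) with alpha_2 = 1."""
    return 1.0 / (1.0 + 3.0 * f * (1.0 - f) ** 2)

def equilibrium(f0, zeta, alpha2, dt=0.01, steps=20000):
    """Relax Eq. (29) from the initial fraction f0 using explicit Euler steps."""
    f = f0
    for _ in range(steps):
        f += dt * f * (zeta * (1.0 + 3.0 * f * (1.0 - f) ** 2 * (2.0 * alpha2 - 1.0)) - 1.0)
    return f
```

The bound on $\zeta$ is lowest at $f = 1/3$, where $3f(1-f)^2 = 4/9$ and hence $\zeta > 9/13 \approx 0.69$.
For $\zeta = 0.85$, $\alpha_2 = 0.8$ the dynamics started at $f(0) = 0.3$ settles near $f \approx 0.6$,
a coexistence state, while $f(0) = 0.05$ decays to zero.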



5 Conclusions 

In this paper, we have explored a new route towards cooperation. This route differs from many
other attempts, most of which are rooted in traditional or evolutionary game theory, where
the transition toward cooperation is induced by specific neighborhood relations, repeated
interactions, discounted payoffs over long time horizons, indirect reciprocity, favorable strategy
mutations, the enforcement of social norms, etc. [9, 19, 26-28]. All of these propositions either
improve the payoff of the cooperating strategy or provide, in one way or another, additional
information agents may consider when making a strategic decision.

Our approach is much simpler: it does not change payoffs at all, but only counts on the
information agents already have if they simultaneously play a 2-person PD game with their $n$ neighbors
(which can be spatial neighbors, or randomly chosen). This information is the local fraction of
cooperators, $f_i = n_1/n$, and defectors, $(1 - f_i)$, of an agent, which also enters the calculation of the
payoff, Eq. <H§. That means there is no additional information assumed. We argue instead that
agents, at the same time, respond to this information in two different ways, as summarized in
Eq. (9). In a strategic interaction, they choose the strategy $\theta_i$ that will lead to the highest payoff
$a_i(\theta_i, f_i)$, whereas in the case of social herding they simply respond to the local frequency of each
strategy in a nonlinear manner, $\mathcal{F}(f_i)$. In some sense, the second way assumes less information
because no payoff matrix needs to be known. This implies that both strategies are seen as equally
valuable.

The parameter $\zeta$ gives a weight to these two different ways of utilizing the information associated
with $f_i$. In Eq. ^ we have assumed $\zeta_i$ to be an individual parameter, which means that agents,
dependent on their internal preferences or access to knowledge (such as a known payoff), can give
different weights to these two responses. In this paper, however, we did not further explore this
source of heterogeneity, but kept it as a global parameter, constant and the same for all agents.
This limit case is equivalent to assuming a population of agents, a fraction $\zeta$ of which only follows
social herding, whereas a fraction $(1 - \zeta)$ only considers strategic interactions. This allows us to
interpret our main result about a critical $\zeta$ to turn a population of defectors into cooperators
in a more general manner: $\zeta$ can be seen as the minimal fraction of agents following only social
herding that enables the transition to cooperation. With respect to the access to information, we
can interpret this finding as follows: if the information about the payoff matrix is known to all
agents, they will, in the given Prisoner's Dilemma setting, collectively choose defection (which
is the suboptimal state). However, if only a small fraction of agents (about 20%, see Fig. [2j)
has information about the payoff matrix and the majority just responds to the decision of
others by means of nonlinear social herding, this can drive the system towards a state where
cooperation is the dominant strategy. To put it succinctly: less information (or a larger fraction
of uninformed agents) will lead to more cooperation.

This conclusion still relies on choosing the right nonlinear social herding
in response to the local (or global) fraction of cooperators. We have demonstrated that the linear
response, where the probability to choose a strategy is directly proportional to the fraction of
that strategy in the neighborhood (or the population), fails to enhance cooperation. Instead, we
have to choose a nonlinearity, expressed in terms of the parameters $\alpha_1$, $\alpha_2$, from the region of
positive Allee (pa) effects (Fig. [2J). As a minimal condition for the transition towards cooperation,
all transition rates can be (but do not necessarily have to be) chosen according to the linear voter
model, except $\alpha_2$, which has to be above the critical value 0.5 to break the tie in case of an equal
fraction of cooperators and defectors. Further, the combination of $\zeta$ and $\alpha_2$ also determines the
maximum level of cooperation that can be reached using the two different responses.

Our finding shows that social herding matters most in tie situations, which is similar to
another class of group decision models [6]. Designing a mechanism that influences social herding
only in this situation also provides a quite "cost-efficient" solution, in that we do not need to
enforce a decision against the majority to allow for the transition toward cooperation. Agents
can still follow the strategy of the majority. Only in the undecided case do we need to ensure that
the symmetry is broken in the "right" direction.

Finally, we wish to point out that in this paper we have discussed a kind of worst-case scenario
where, in the absence of social herding, defection is the only stable state for the system. Even
for this case, our proposed mechanism excels in transferring defectors into cooperators, on the
population level. We can leverage other model ingredients to further facilitate this transition.
For example, we could take into account stochastic changes of the strategy, as already considered in the
strategic component $G(\theta_i)$, Eq. which would support random cooperation. We can further
allow for repeated interaction or "the shadow of the future", which are already known to foster
cooperation [2, 3, 31]. The important message here is that, even under worst conditions, there
is a way to reach cooperation in a game-theoretical setting by means of social herding, i.e. by
pure social influence. Including this additional dimension into strategic interaction avoids the
lock-in into pure defection, which is the suboptimal state compared to pure cooperation. The
mechanism we have proposed here does not rely on additional information; in fact, it uses less
of the available information, in particular no information about the payoff structure and no
comparison of alternative strategies. Further, we emphasize again the "cost efficiency" of the
mechanism proposed, in that it does not enforce decisions against the majority, but influences the
decisions of agents only in tie situations.

Summing up, adding social herding to strategic interactions is a way to substantially increase 
the level of cooperation with less, not more: simple rules instead of far-reaching regulations to 
enforce cooperation, no additional information as assumed e.g. in success driven mechanisms, no 
additional costs as in other incentive schemes. Just social herding, the right (nonlinear) way. 